The Key Approach to Translation: Word Alignment Models
Abstract
This paper focuses on a key aspect of Statistical Machine Translation (SMT): word alignment. It presents several word alignment models, first distinguishing between the methods and then highlighting the preferred one. Each model is given a partially detailed mathematical explanation, and the later models are accompanied by a brief implementation of the Expectation Maximization (EM) algorithm. Statistical and error analyses follow each group of models. The aim is to show an integral subproblem that SMT must deal with and how some computational linguists and computer scientists approach it.
General Terms: EM Algorithm, Statistical Machine Translation (SMT)
Similar resources
Improving Domain-Specific Word Alignment for Computer Assisted Translation
This paper proposes an approach to improve word alignment in a specific domain, in which only a small-scale domain-specific corpus is available, by adapting the word alignment information in the general domain to the specific domain. This approach first trains two statistical word alignment models with the large-scale corpus in the general domain and the small-scale corpus in the specific domai...
Statistical machine translation: from single word models to alignment templates
In this work, new approaches for machine translation using statistical methods are described. In addition to the standard source-channel approach to statistical machine translation, a more general approach based on the maximum entropy principle is presented. Various methods for computing single-word alignments using statistical or heuristic models are described. Various smoothing techniques, me...
Association-Based Bilingual Word Alignment
Bilingual word alignment forms the foundation of current work on statistical machine translation. Standard word-alignment methods involve the use of probabilistic generative models that are complex to implement and slow to train. In this paper we show that it is possible to approach the alignment accuracy of the standard models using algorithms that are much faster, and in some ways simpler, bas...
A Discriminative Framework for Bilingual Word Alignment
Bilingual word alignment forms the foundation of most approaches to statistical machine translation. Current word alignment methods are predominantly based on generative models. In this paper, we demonstrate a discriminative approach to training simple word alignment models that are comparable in accuracy to the more complex generative models normally used. These models have the advantages ...
Building and Using Parallel Texts: Data-Driven Machine Translation and Beyond
Bilingual word alignment forms the foundation of current work on statistical machine translation. Standard word-alignment methods involve the use of probabilistic generative models that are complex to implement and slow to train. In this paper we show that it is possible to approach the alignment accuracy of the standard models using algorithms that are much faster...